Online PCA for Contaminated Data
Authors
Abstract
We consider online Principal Component Analysis (PCA) in the setting where contaminated samples (containing outliers) are revealed sequentially to the Principal Components (PCs) estimator. Because they are sensitive to outliers, previous online PCA algorithms fail in this setting, and their results can be arbitrarily skewed by the outliers. We propose an online robust PCA (online RPCA) algorithm that steadily improves the PC estimate from an initial one, even when faced with a constant fraction of outliers. We show that the final result of online RPCA has an acceptable degradation from the optimum. In fact, under mild conditions, online RPCA achieves the maximal robustness, with a 50% breakdown point. Moreover, online RPCA is efficient in both storage and computation, since it does not need to revisit previous samples as traditional robust PCA algorithms do. This makes online RPCA scalable to large-scale data.
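The abstract does not spell out the update rule, so the following is not the authors' algorithm. It is only a minimal sketch of the contrast the abstract draws, under loudly stated assumptions: a plain Oja-style online update is compared with a guarded variant that normalizes each sample and accepts it with probability equal to its squared projection onto the current estimate. The function names, the toy 2-D stream, the step size, and the 20% outlier fraction are all hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else list(v)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def oja_step(w, x, lr):
    """One step of Oja's rule: nudge w toward x in proportion to <w, x>, then renormalize."""
    s = dot(w, x)
    return normalize([wi + lr * s * xi for wi, xi in zip(w, x)])


def make_stream(n, outlier_frac, rng):
    """Hypothetical 2-D stream: inliers vary mostly along axis 0,
    outliers are large spikes along axis 1."""
    for _ in range(n):
        if rng.random() < outlier_frac:
            yield [0.0, rng.choice([-10.0, 10.0])]
        else:
            yield [rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.1)]


def plain_online_pca(stream, w0, lr=0.01):
    """Non-robust baseline: every sample, outlier or not, updates the estimate."""
    w = normalize(w0)
    for x in stream:
        w = oja_step(w, x, lr)
    return w


def guarded_online_pca(stream, w0, rng, lr=0.01):
    """Assumed robust variant (illustrative only): normalize each sample so its
    magnitude cannot dominate, then accept it with probability <w, z>^2."""
    w = normalize(w0)
    for x in stream:
        z = normalize(x)
        if rng.random() < dot(w, z) ** 2:
            w = oja_step(w, z, lr)
    return w


# Each sample is visited once and then discarded by both estimators.
data = list(make_stream(2000, 0.2, random.Random(0)))
w_init = [1.0, 0.3]  # a rough initial estimate of the inlier direction
w_plain = plain_online_pca(data, w_init)
w_guarded = guarded_online_pca(data, w_init, random.Random(1))
print("plain:  ", w_plain)
print("guarded:", w_guarded)
```

In this toy setting the plain update is pulled toward the outlier axis, while the guarded update stays close to the inlier direction it started near. Both keep only the current estimate in memory, which is the storage property the abstract highlights.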
Similar resources
Fast Inference of Contaminated Data for Real Time Object Tracking
Online object tracking is a challenging problem because any useful approach must handle various nuisances, including illumination changes and occlusions. Though much work focuses on observation models by employing sophisticated approaches for contaminated data, it commonly assumes that the samples for updating the observation model are uncorrupted or can be restored during updating. For instance,...
Online PCA for Contaminated Data Supplementary Material
Before proving the theoretical results in this paper, we first present the following lemmas used in the proofs. Lemma 1. There exists a constant $c$ that depends only on $\mu$ and $d$, such that for all $\gamma > 0$ and $b$ signals $\{x_i\}_{i=1}^{b}$, the following holds with high probability: $\sup_{w \in S_d}$ ...
Principal Component Analysis with Contaminated Data: The High Dimensional Case
We consider the dimensionality-reduction problem (finding a subspace approximation of observed data) for contaminated data in the high dimensional regime, where the number of observations is of the same magnitude as the number of variables of each observation, and the data set contains some (arbitrarily) corrupted observations. We propose a High-dimensional Robust Principal Component Analysis (...
Principal Component Analysis with Contaminated Data: The High Dimensional Case
We consider the dimensionality-reduction problem (finding a subspace approximation of observed data) for contaminated data in the high dimensional regime, where the number of observations is of the same magnitude as the number of variables of each observation, and the data set contains some (arbitrarily) corrupted observations. We propose a High-dimensional Robust Principal Component Analys...
Robust state estimation in power systems using pre-filtering measurement data
State estimation is the foundation of any control and decision making in power networks. The first requirement for a secure network is a precise and safe state estimator in order to make decisions based on accurate knowledge of the network status. This paper introduces a new estimator which is able to detect bad data with few calculations without need for repetitions and estimation residual cal...
Publication date: 2013